On the Reachability of Trustworthy Information from Integrated Exploratory Biological Queries
Abstract
Levels of curation across biological databases are widely recognized as being highly variable, depending on provenance and type. In spite of ambiguous quality, searches against biological sources, such as those for sequence homology, remain a frontline strategy for biomedical scientists studying molecular data. In the following, we investigate the accessibility of well-curated data retrieved from exploratory queries across multiple sources. We present the architecture and design of a lightweight data integration platform conducive to graph-theoretic analysis. Using data collected via this framework, we examine the reachability of evidence-supported annotations across triangulated sources in the face of uncertainty, using a simple random sampling model oriented around fault tolerance. We characterize the accessibility of high-quality data from uncertain queries and the levels of redundancy across data sources, and find that encountering non-experimentally verified annotations is generally nearly as likely as encountering experimentally verified annotations, with the exception of a group of proteins whose link structure is dominated by experimental evidence. Finally, we discuss the prospect of determining the overall accessibility of relevant information based on metadata about a query and its results.
Similar resources
Towards Supporting Exploratory Search over the Arabic Web Content: The Case of ArabXplore
Due to the huge amount of data published on the Web, the Web search process has become more difficult, and it is sometimes hard to get the expected results, especially when the users are less certain about their information needs. Several efforts have been proposed to support exploratory search on the web by using query expansion, faceted search, or supplementary information extracted from exte...
Developing a BIM-based Spatial Ontology for Semantic Querying of 3D Property Information
With the growing dominance of complex and multi-level urban structures, current cadastral systems, which are often developed based on 2D representations, are not capable of providing unambiguous spatial information about urban properties. Therefore, the concept of 3D cadastre is proposed to support 3D digital representation of land and properties and facilitate the communication of legal owners...
Genomic pathways database and biological data management.
In this paper, we discuss the properties of biological data and challenges it poses for data management, and argue that, in order to meet the data management requirements for 'digital biology', careful integration of the existing technologies and the development of new data management techniques for biological data are needed. Based on this premise, we present PathCase: Case Pathways Database S...
Evaluating Reachability Queries over Path Collections
Several applications in areas such as biochemistry and GIS involve storing and querying large volumes of sequential data stored as path collections. There are a number of interesting queries that can be posed on such data. This work focuses on reachability queries: given a path collection and two nodes vs and vt, determine whether a path from vs to vt exists and identify it. To answer these queries, ...
Integrated Management Model for Pre-hospital and Hospital Emergency Rooms in Iran
Background and Objective: The need for emergency medical care is a shadow that accompanies humans and may arise for any individual under any circumstances. Emergency care service systems differ completely across countries, are very complex and dispersed, and have false financial incentives; rising waiting times are among the most important emergency challenges. In this regard, ...